Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification
Cyberbullying is a pervasive problem in online communities. To identify
cyberbullying cases in large-scale social networks, content moderators depend
on machine learning classifiers for automatic cyberbullying detection. However,
existing models remain unfit for real-world applications, largely due to a
shortage of publicly available training data and a lack of standard criteria
for assigning ground truth labels. In this study, we address the need for
reliable data using an original annotation framework. Inspired by social
sciences research into bullying behavior, we characterize the nuanced problem
of cyberbullying using five explicit factors to represent its social and
linguistic aspects. We model this behavior using social network and
language-based features, which improve classifier performance. These results
demonstrate the importance of representing and modeling cyberbullying as a
social phenomenon.
Comment: 12 pages, 5 figures, 22 tables. Accepted to the 14th International AAAI Conference on Web and Social Media (ICWSM'20).
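To make the modeling idea concrete, here is a minimal, hypothetical sketch of combining language-based features with simple social-network features in a single classifier; the toy data, feature names, and model choice are illustrative assumptions, not the paper's actual feature set or pipeline.

```python
# Minimal sketch (not the paper's exact pipeline): concatenating linguistic
# features with social-network features for cyberbullying classification.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data: message text plus illustrative social features
# (e.g., repetition count between the pair, follower imbalance score).
texts = ["you are worthless", "great game last night",
         "nobody likes you", "see you at practice"]
social = np.array([[12, 0.9], [1, 0.4], [8, 0.8], [2, 0.5]])
labels = [1, 0, 1, 0]  # 1 = cyberbullying, 0 = benign

vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(texts)      # language-based features
X = hstack([X_text, csr_matrix(social)])      # add social-network features

clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```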
TADA: Task-Agnostic Dialect Adapters for English
Large Language Models, the dominant starting point for Natural Language
Processing (NLP) applications, fail at a higher rate for speakers of English
dialects other than Standard American English (SAE). Prior work addresses this
using task-specific data or synthetic data augmentation, both of which require
intervention for each dialect and task pair. This poses a scalability issue
that prevents the broad adoption of robust dialectal English NLP. We introduce
a simple yet effective method for task-agnostic dialect adaptation by aligning
non-SAE dialects using adapters and composing them with task-specific adapters
from SAE. Task-Agnostic Dialect Adapters (TADA) improve dialectal robustness on
4 dialectal variants of the GLUE benchmark without task-specific supervision.
Comment: 5 pages; Findings of ACL 2023.
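As an illustration of the adapter-composition idea, the following is a minimal PyTorch sketch in which a task-agnostic dialect adapter is applied before an SAE task-specific adapter on top of frozen encoder states. The module names, dimensions, and bottleneck design are assumptions for illustration, not the paper's implementation.

```python
# A minimal sketch of stacking a dialect adapter with a task adapter.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_dim=768, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, h):
        return h + self.up(self.act(self.down(h)))  # residual connection

dialect_adapter = BottleneckAdapter()  # trained once, task-agnostically, to align a dialect toward SAE
task_adapter = BottleneckAdapter()     # trained on SAE task data only

hidden_states = torch.randn(2, 16, 768)   # frozen encoder output (batch, seq, dim)
aligned = dialect_adapter(hidden_states)  # map dialectal representations toward SAE
task_out = task_adapter(aligned)          # then apply the unchanged SAE task adapter
print(task_out.shape)
```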
Impressions: Understanding Visual Semiotics and Aesthetic Impact
Is aesthetic impact different from beauty? Is an image's visual salience a
reflection of its capacity for effective communication? We present Impressions, a novel
dataset through which to investigate the semiotics of images, and how specific
visual features and design choices can elicit specific emotions, thoughts and
beliefs. We posit that the impactfulness of an image extends beyond formal
definitions of aesthetics, to its success as a communicative act, where style
contributes as much to meaning formation as the subject matter. However, prior
image captioning datasets are not designed to empower state-of-the-art
architectures to model potential human impressions or interpretations of
images. To fill this gap, we design an annotation task heavily inspired by
image analysis techniques in the Visual Arts to collect 1,440 image-caption
pairs and 4,320 unique annotations exploring impact, pragmatic image
description, impressions, and aesthetic design choices. We show that existing
multimodal image captioning and conditional generation models struggle to
simulate plausible human responses to images. However, this dataset
significantly improves their ability to model impressions and aesthetic
evaluations of images through fine-tuning and few-shot adaptation.
Comment: To be published in EMNLP 2023.
NormBank: A Knowledge Bank of Situational Social Norms
We present NormBank, a knowledge bank of 155k situational norms. This
resource is designed to ground flexible normative reasoning for interactive,
assistive, and collaborative AI systems. Unlike prior commonsense resources,
NormBank grounds each inference within a multivalent sociocultural frame, which
includes the setting (e.g., restaurant), the agents' contingent roles (waiter,
customer), their attributes (age, gender), and other physical, social, and
cultural constraints (e.g., the temperature or the country of operation). In
total, NormBank contains 63k unique constraints from a taxonomy that we
introduce and iteratively refine here. Constraints then apply in different
combinations to frame social norms. Under these manipulations, norms are
non-monotonic - one can cancel an inference by updating its frame even
slightly. Still, we find evidence that neural models can help reliably extend
the scope and coverage of NormBank. We further demonstrate the utility of this
resource with a series of transfer experiments.
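A small, hypothetical sketch of what grounding a norm in a multivalent frame might look like, and how a slight frame update can cancel an inference; the schema, constraint strings, and norm entries below are invented for illustration and do not reflect NormBank's actual format.

```python
# Illustrative frame-conditioned norms and their non-monotonic behavior.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Frame:
    setting: str
    role: str
    constraints: frozenset = field(default_factory=frozenset)

# Hypothetical norm table: (behavior, frame) -> judgment
norms = {
    ("drink alcohol", Frame("restaurant", "customer", frozenset({"age>=21"}))): "okay",
    ("drink alcohol", Frame("restaurant", "customer", frozenset({"age<21"}))): "unexpected",
    ("drink alcohol", Frame("restaurant", "waiter", frozenset({"on shift"}))): "unexpected",
}

def judge(behavior, frame):
    return norms.get((behavior, frame), "unknown")

adult = Frame("restaurant", "customer", frozenset({"age>=21"}))
minor = Frame("restaurant", "customer", frozenset({"age<21"}))
print(judge("drink alcohol", adult))  # okay
print(judge("drink alcohol", minor))  # a small frame change cancels the inference
```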
Multi-VALUE: A Framework for Cross-Dialectal English NLP
Dialect differences arising from regional, social, and economic factors cause
performance discrepancies for many groups of language technology users.
Inclusive and equitable language technology must critically be dialect
invariant, meaning that performance remains constant over dialectal shifts.
Current systems often fall short of this ideal since they are designed and
tested on a single dialect: Standard American English (SAE). We introduce a
suite of resources for evaluating and achieving English dialect invariance. The
resource is called Multi-VALUE, a controllable rule-based translation system
spanning 50 English dialects and 189 unique linguistic features. Multi-VALUE
maps SAE to synthetic forms of each dialect. First, we use this system to
stress test question answering, machine translation, and semantic parsing.
Stress tests reveal significant performance disparities for leading models on
non-standard dialects. Second, we use this system as a data augmentation
technique to improve the dialect robustness of existing systems. Finally, we
partner with native speakers of Chicano and Indian English to release new
gold-standard variants of the popular CoQA task. To execute the transformation
code, run model checkpoints, and download both synthetic and gold-standard
dialectal benchmark datasets, see http://value-nlp.org.
Comment: ACL 2023.
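To illustrate the flavor of a rule-based dialect transformation, here is a toy perturbation rule (zero copula) applied as data augmentation. It is a simplified stand-in written for this sketch, not Multi-VALUE's rule implementation, which covers 189 features across 50 dialects.

```python
# Toy SAE -> synthetic-dialect perturbation used for data augmentation.
import re

def zero_copula(sentence: str) -> str:
    """Drop a present-tense copula ("is"/"are"), as licensed in several dialects."""
    return re.sub(r"\b(is|are)\s+", "", sentence, count=1)

def augment(examples):
    """Pair each SAE example with a perturbed, dialect-like copy."""
    return [(s, zero_copula(s)) for s in examples]

print(augment(["She is walking to the store.",
               "They are happy about the result."]))
```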
CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation
Annotated data plays a critical role in Natural Language Processing (NLP) in
training models and evaluating their performance. Given recent developments in
Large Language Models (LLMs), models such as ChatGPT demonstrate zero-shot
capability on many text-annotation tasks, comparable with or even exceeding
human annotators. Such LLMs can serve as alternatives for manual annotation,
due to lower costs and higher scalability. However, little work has leveraged
LLMs as complementary annotators or explored how annotation work is best
allocated between humans and LLMs to achieve both quality and cost objectives. We
propose CoAnnotating, a novel paradigm for Human-LLM co-annotation of
unstructured texts at scale. Under this framework, we utilize uncertainty to
estimate LLMs' annotation capability. Our empirical study across different
datasets shows CoAnnotating to be an effective means of allocating work, with up
to 21% performance improvement over a random baseline. For the code
implementation, see https://github.com/SALT-NLP/CoAnnotating.
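A minimal sketch of uncertainty-guided allocation in this spirit: sample several LLM labels per instance, measure disagreement with entropy, and route uncertain instances to human annotators. The threshold and entropy estimator are illustrative assumptions, not necessarily the paper's exact procedure.

```python
# Route each instance to the LLM or a human based on label-sample entropy.
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def allocate(llm_samples_per_instance, threshold=0.5):
    """Return 'llm' for confident instances, 'human' for uncertain ones."""
    return ["human" if entropy(samples) > threshold else "llm"
            for samples in llm_samples_per_instance]

# Toy example: three instances, each labeled five times by the LLM.
samples = [
    ["positive"] * 5,                                              # unanimous -> keep with LLM
    ["positive", "negative", "positive", "negative", "neutral"],   # disagreement -> human
    ["negative"] * 4 + ["positive"],                               # mostly agrees -> threshold decides
]
print(allocate(samples))  # ['llm', 'human', 'human']
```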
Can Large Language Models Transform Computational Social Science?
Large Language Models (LLMs) like ChatGPT are capable of successfully
performing many language processing tasks zero-shot (without the need for
training data). If this capacity also applies to the coding of social phenomena
like persuasiveness and political ideology, then LLMs could effectively
transform Computational Social Science (CSS). This work provides a road map for
using LLMs as CSS tools. Towards this end, we contribute a set of prompting
best practices and an extensive evaluation pipeline to measure the zero-shot
performance of 13 language models on 24 representative CSS benchmarks. On
taxonomic labeling tasks (classification), LLMs fail to outperform the best
fine-tuned models but still achieve fair levels of agreement with humans. On
free-form coding tasks (generation), LLMs produce explanations that often
exceed the quality of crowdworkers' gold references. We conclude that today's
LLMs can radically augment the CSS research pipeline in two ways: (1) serving
as zero-shot data annotators on human annotation teams, and (2) bootstrapping
challenging creative generation tasks (e.g., explaining the hidden meaning
behind text). In summary, LLMs can significantly reduce costs and increase the
efficiency of social science analysis in partnership with humans.
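As a sketch of the zero-shot annotation workflow this implies, the snippet below builds a constrained labeling prompt and loops over instances. Here `query_llm` is a hypothetical placeholder for whatever chat-completion client is used, and the prompt wording is illustrative rather than the paper's.

```python
# Zero-shot taxonomic labeling with an LLM acting as a data annotator.
def build_prompt(text: str, labels: list[str]) -> str:
    label_list = ", ".join(labels)
    return (
        "You are annotating social media posts for political ideology.\n"
        f"Choose exactly one label from: {label_list}.\n"
        f"Post: {text}\n"
        "Answer with the label only."
    )

def annotate(texts, labels, query_llm):
    """Zero-shot annotation loop: one constrained prompt per instance."""
    return [query_llm(build_prompt(t, labels)).strip() for t in texts]

# Usage (with any callable that maps a prompt string to a completion string):
# predictions = annotate(posts, ["liberal", "conservative", "moderate"], query_llm)
```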